TEMPORAL CROP SIGNATURES


The Kansas Intensive LACIE test site data consists of three to five
LANDSAT images of each site taken in the fall, early spring, and early
summer. We have treated this multi-temporal data set in a classical way
with non-parametric supervised decision rules and have obtained poor
crop discrimination results. Some of these results were reported
last quarter.

During this quarter, we have been investigating the reasons for this
poor performance. We have carefully compared the category statistics
and cross-compared these statistics with those obtained from our
unsupervised clustering approach.

1. Signature Comparison

Temporal data were collected and graphed by ground truth category
for Finney and Morton Counties. The mean gray tone values were
computed by band and category and were plotted. These plots are
presented in Figures 1.1 and 1.2. The plots for the ground truth
categories are easily discriminated for the Finney County test site.
However, in Morton County the plots are very similar except for wheat
and grass. One would expect that corn, summer fallow, grain sorghum,
and rye would be easily confused. Looking at the same category in both
counties, we see that the plots are not similar at all.

For example, wheat in Finney County tends to have a wider range of
values. Band 6 for wheat increases, decreases, then increases again in
Finney County, whereas it decreases, then increases in Morton. Band 7
decreases in brightness after April in Finney County, but in Morton
County band 7 is increasing after May 9.

Clearly it would be difficult to use data from one test site to
discriminate categories in another.

To see what regions the spatial clustering procedure was finding,
the temporal plots for 6 clustered regions were produced. These are
presented in Figures 1.3 and 1.4. The plots for the first clusters in
Figure 1.3 match very closely with the ground truth category plots for
Finney County in Figure 1.1. Cluster 1 matches with wheat in Figure 1.1;
Cluster 2 matches with corn, etc. As the cluster number increases, it is
more difficult to identify the ground truth category. Cluster 7 from
Figure 1.3 is probably corn, but one cannot be certain.

The clusters for Morton County are harder to identify. Clusters 1
and 5 in Figure 1.4 match with wheat for Morton County in Figure 1.2.
The remaining clusters could be assigned to any of corn, summer fallow,
grain sorghum, or rye.

These large differences between counties and the generally large
variance of the data for any one category prompted the idea that
some ground truth labels were in fact wrong and/or that there was a large
difference in the reflectance of crops due to different fields being in
different growth stages. To test the idea that differences in growth
stage can increase the variance for a ground truth sample, the following
experiment was performed. Each wheatfield in the ground truth was given
a unique label. The mean and standard deviation for each field were
computed and the mean plotted. The graphs of the mean gray tone for 6
fields seem to fall into 2 groups: fields 2, 3, and 10 are similar, as
are fields 1, 18, and 20. Table 1 presents the standard deviation by band
for all wheatfields and six selected wheatfields. In all cases, the
standard deviation for the individual fields is less (by factors of 3 to
10) than the standard deviation computed over all fields of the same
category. Difference in growth stage is therefore a major contributor to
the category confusion problem.
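The per-field experiment above can be illustrated with a minimal sketch. The function name and the sample gray tones are hypothetical and are not taken from Table 1; the sketch only shows how a pooled standard deviation can exceed the per-field values when field means differ.

```python
from statistics import mean, pstdev


def field_and_pooled_std(fields):
    """Compare per-field spread with the spread over all fields pooled.

    fields: dict mapping a field id to a list of gray tone values
    (one band, one date). Returns ({field_id: (mean, std)}, pooled_std).
    """
    per_field = {fid: (mean(vals), pstdev(vals)) for fid, vals in fields.items()}
    pooled = [v for vals in fields.values() for v in vals]
    return per_field, pstdev(pooled)


# Hypothetical gray tones for two wheat fields at different growth stages.
fields = {"field_2": [20, 21, 22], "field_18": [40, 41, 42]}
per_field, pooled_std = field_and_pooled_std(fields)
```

Each field is tight around its own mean, but the pooled sample spans both means, so its standard deviation is several times larger than that of either field.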

Figure 1.5 Graphs of mean grey tone of 6 wheat fields in Morton County.

Table 1. Standard deviation of all wheatfields and six individual wheat
fields in Morton County.

A - Wheat
B - Grass
C - Corn
D - Summer Fallow
E - Grain Sorghum
F - Rye

Figure 1.8 Graph showing association of 5 clusters with ground truth
categories for Morton County.

Figure 1.9 Mean grey tone for wheat and cluster 1 in Morton County.
Figure 1.10 Mean grey tone for wheat and cluster 2 in Morton County.
Figure 1.11 Mean grey tone for wheat and cluster 1 in Morton County.
Figure 1.12 Mean grey tone for summer fallow and cluster 2 in Morton County.
Figure 1.13 Mean grey tone for corn and cluster 2 in Morton County.

Throughout these experiments we have been concerned with the validity
of the ground truth data. One method for testing the ground truth data is
to compare the clustered regions with the ground truth. Figure 2.2.5 shows
the original LACIE ground truth for Morton County. The clustered data is
shown in Figure 2.2.2. Clustering was performed on bands 5 and 7 of all
dates, using dates/bands OCT23/5, OCT23/7, MAY9/5, and MAY9/7 for gradient
information. To see the confusion shown by the clusters, we can associate
each cluster label with the ground truth labels it covers. We can represent
this association by the graph shown in Figure 1.8. It can easily be seen
that almost every cluster covers more than one ground truth category. To
see what causes this, we can look at the fields that produce each of the
edges from a cluster and the edges to a ground truth category. For
example, a field labelled wheat and cluster 1 has the mean gray tone
temporal plot shown in Figure 1.9. Another wheatfield, labelled cluster 2,
has the plot shown in Figure 1.10, and a third wheatfield with cluster
label 1 has the plot in Figure 1.11. These three graphs differ markedly in
character. This difference can be explained by either the ground truth
being incorrect or the fields being at different points in the "normal"
growth pattern when the imagery was acquired. If we assume that we are
seeing different points on a "normal" growth curve, then this data can
be used to construct such a growth curve. On the other hand, if it seems
that the ground truth is incorrect, we can correct the ground truth and
perhaps obtain better discrimination results.
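The association between cluster labels and ground truth labels can be sketched as a count of shared cells, with each non-zero count standing for one edge of the graph. This is only an illustration: the function names, the use of 0 for unlabelled cells, and the tiny sample maps are assumptions.

```python
from collections import defaultdict


def associate(cluster_map, truth_map, ignore=0):
    """Count cells shared by each (cluster label, ground truth label) pair.

    Both maps are 2-D lists of the same shape. Cells whose cluster or
    ground truth value equals `ignore` are skipped.
    """
    counts = defaultdict(int)
    for cluster_row, truth_row in zip(cluster_map, truth_map):
        for c, t in zip(cluster_row, truth_row):
            if c != ignore and t != ignore:
                counts[(c, t)] += 1
    return dict(counts)


def categories_per_cluster(counts):
    """List the ground truth categories that each cluster covers."""
    covered = {}
    for (c, t) in counts:
        covered.setdefault(c, set()).add(t)
    return covered


# Hypothetical 2 x 3 maps: cluster labels and ground truth letters.
cluster_map = [[1, 1, 2], [1, 2, 2]]
truth_map = [["A", "A", "A"], ["D", "D", "C"]]
edges = associate(cluster_map, truth_map)
```

A cluster that maps to more than one category in `categories_per_cluster(edges)` is one whose fields deserve the kind of field-by-field inspection described above.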

We can also fix our attention on a single cluster label. Figures
1.10, 1.12, and 1.13 are from fields labelled cluster 2 whose ground truth
categories are wheat, summer fallow, and corn, respectively. These plots,
from the same cluster, are similar. The plots for a given cluster label
are not always so similar and in some cases vary as much as those shown
above for wheat. This indicates that the clustering process has perhaps
gone too far in grouping regions together.

2.0 Unsupervised Clustering 

The last report included a qualitative description of the unsupervised
clustering approach. It was mentioned that because of the high
error rate in the classification of some crops, the ground truth
information on some fields was suspect. In the last quarter, an attempt
was made to overcome this problem, using a couple of techniques.
Unsupervised clustering was one of them. In this section, we will
describe some of the results obtained by this process.

The procedure, strictly speaking, is not completely unsupervised.
There are a few parameters (described below) which the user enters. These
serve to control the degree or level of clustering. In that sense, the
user does supervise the process. However, when compared to the Bayes or
discriminant analysis classification, the meaning of unsupervised becomes
clearer. The latter two procedures require a priori knowledge of some
ground truth data. That is, in order to do any classification, they need
some training sites to work on. The classification is supervised by these
training sites, along with some control parameters specified by the user.

Unsupervised clustering, on the other hand, groups like things
together, without any regard to what they are. This makes the
unsupervised clustering very easy to perform, as all that is needed is
the original multiband image. Only at the end of the processing do we
need any ground truth information. This is not for training sites, but
for identification and measuring classification accuracy. Thus, if parts
of an image are identified to correspond to certain crops or land use,
the clustered image can be used to check the ground truth data that
seemed faulty.

In order to understand some of the results of this procedure, the
various steps of the process are described below. Basically, it is a
two-part procedure. In the first part, usually referred to as spatial
clustering, an image is generated which marks and delineates the
different homogenous areas of the input image. The process uses the
spatial information surrounding a cell to determine these areas. In the
second part, or measurement space clustering, these homogenous areas are
grouped together according to their spectral signatures. Let us examine
the operations involved in these two stages.

Usually before any spatial clustering, some preprocessing of the
image is done. This involves quantization of the image, followed by
contrast enhancing. The preprocessing options are performed to increase
the effect of the first spatial clustering operation: generating a
gradient image. The image gradient operation is defined on a multiband
image and serves to distinguish between the spectrally homogenous and
boundary areas in the image. The preprocessing options tend to make the
edges between regions sharp.

It is also important here that the bands of the original image be
registered correctly. The better the registration, the sharper the
boundaries on the gradient image, and the better the definition of the
homogenous regions. Unfortunately, the registration between bands from
different dates was not very good. Thus, in the first run-through, the
results from this operation were poor. The problem was partially
overcome by not using all the bands, but only some selected ones, either
the ones showing the least misregistration between dates, or bands from
one date only.

The gradient operation, like its calculus counterpart, generates an
image in which boundaries or points of rapid change show up as high values.
The interiors of the fields, in which there is little change in the gray
tones, show up as low gradient values. Thus, by thresholding the image,
it is possible to separate boundary cells.

There are various kinds of gradient operations available. We have
used a Roberts gradient throughout.
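For illustration, a Roberts (cross-difference) gradient over a multiband image might look like the sketch below. The 2x2 cross-difference is the standard Roberts form; summing the per-band values into one gradient image and zeroing the last row and column are assumptions, since the report does not state how the bands were combined.

```python
def roberts_gradient(bands):
    """Roberts cross gradient summed over bands.

    bands: list of 2-D lists of gray tones, all the same shape.
    Returns a 2-D list; the last row and column stay 0 because the
    2x2 window does not fit there.
    """
    rows, cols = len(bands[0]), len(bands[0][0])
    grad = [[0] * cols for _ in range(rows)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            total = 0
            for b in bands:
                # Sum of absolute differences along the two diagonals.
                total += abs(b[r][c] - b[r + 1][c + 1])
                total += abs(b[r + 1][c] - b[r][c + 1])
            grad[r][c] = total
    return grad


# Hypothetical single band with a vertical edge between columns 1 and 2.
band = [[10, 10, 50, 50] for _ in range(4)]
gradient = roberts_gradient([band])
```

Cells inside either flat half come out as 0, while the cells straddling the edge receive a high value, which is what the subsequent thresholding relies on.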

The thresholding of the image is done using a fraction of its
running mean as the cutoff. The running mean of a row is defined as the
mean of 10 rows above and below it. By changing the fraction, one can
raise or lower the cutoff, and control the degree of thresholding. For
most cases, this fraction was set to 1.0. Thus the running mean of the
row became the cutoff. There are other methods available for
thresholding. This scheme was used because it is fast and works well
enough on images of the sizes involved.
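A minimal sketch of this running-mean cutoff follows. It assumes that the window includes the row itself and is clipped at the top and bottom of the image; the report does not say either way. The function and parameter names are hypothetical.

```python
def threshold_by_running_mean(grad, fraction=1.0, half_window=10):
    """Mark cells whose gradient falls below a per-row cutoff.

    The cutoff for row r is `fraction` times the mean gradient over rows
    r - half_window .. r + half_window (clipped to the image).
    Returns a 2-D list with 1 for homogenous cells and 0 for boundary cells.
    """
    rows, cols = len(grad), len(grad[0])
    row_sums = [sum(row) for row in grad]
    mask = []
    for r in range(rows):
        lo = max(0, r - half_window)
        hi = min(rows, r + half_window + 1)
        cutoff = fraction * sum(row_sums[lo:hi]) / ((hi - lo) * cols)
        mask.append([1 if g < cutoff else 0 for g in grad[r]])
    return mask


# Hypothetical gradient image with one strong edge column.
grad = [[0, 80, 0, 0] for _ in range(3)]
mask = threshold_by_running_mean(grad, fraction=1.0)
```

Raising `fraction` raises the cutoff and so admits more cells as homogenous; lowering it marks more cells as boundary.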

After thresholding, the cells which fall below the cutoff are
considered as homogenous cells. The resulting image is usually very noisy,
as it is hard to get very sharp gradient images, despite all the
preprocessing. A "cleaning" operation is then carried out. It removes
isolated cells and the "salt and pepper" effect which constitutes the
noise. The result is an image with the different homogenous areas
separated out and the boundary cells set to zero.
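The report does not give the rule used by the cleaning operation. One simple possibility, shown here purely as an assumed stand-in, is to drop any homogenous cell that has too few homogenous 4-neighbours.

```python
def clean(mask, min_neighbors=1):
    """Remove isolated homogenous cells from a 0/1 mask.

    A cell set to 1 is reset to 0 when fewer than `min_neighbors` of its
    4-connected neighbours are 1. The input mask is not modified.
    """
    rows, cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            neighbors = sum(
                mask[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            )
            if neighbors < min_neighbors:
                out[r][c] = 0
    return out


# Hypothetical mask: a two-cell region and one isolated cell.
mask = [[1, 1, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
cleaned = clean(mask)
```

Here the isolated cell in the second row is removed while the two-cell region survives.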

These areas can be expected to correspond to fields in an image.
However, in many cases, a field may come out subdivided into more than
one homogenous region. This happens if the process does not find the
field spectrally homogenous. For example, if half the field has been
watered, the wet and the dry areas may come out separated.

In the next step, for purposes of identification and clustering, each
separate homogenous area is marked with a unique label. This labelled
image usually constitutes the end of the spatial clustering section.
However, if it is felt that some of the regions were not separated out
sufficiently, a "splitting" operation is available. This operation takes
a region marked homogenous and splits it into two or more sections if it
determines that the region is not homogenous enough. In this process, the
operation examines both the labelled file and the original multiband
image file.
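The labelling step can be sketched as connected-component labelling of the cleaned mask. The use of 4-connectivity and a breadth-first flood fill are assumptions; the report only says that each separate homogenous area receives a unique label.

```python
from collections import deque


def label_regions(mask):
    """Give each 4-connected group of 1-cells a unique positive label.

    Returns (labels, count) where labels is a 2-D list with 0 for
    boundary cells and 1..count for the homogenous regions.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for yy, xx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= yy < rows and 0 <= xx < cols
                                and mask[yy][xx] and labels[yy][xx] == 0):
                            labels[yy][xx] = count
                            queue.append((yy, xx))
    return labels, count


# Hypothetical mask with three separate homogenous areas.
mask = [[1, 1, 0, 1], [1, 0, 0, 1], [0, 0, 1, 0]]
labels, count = label_regions(mask)
```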

The idea behind separating each homogenous region is that we want to
cluster only on the interior of the fields. The signature from the core
of a field is more uniform, and a better representation of the field,
than a signature obtained by including boundary or edge cells. The latter
reflect a mixture of different classes and are prone to give rise to
error in the classification.

The first step in the second stage is to generate the spectral
signatures for the homogenous regions. For this we go back to the
multiband image. The signature for a region is defined as the average
gray tone for that region, for each of the bands of the original image.
Thus, for a multiband image, we have a spectral signature vector for each
region. In addition, an area count is generated for each region. This is
used to weight the signatures when regions or clusters are grouped
together.
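The signature table described here (mean gray tone per band plus an area count for each region) can be sketched as follows. The data layout, with one 2-D list per band and label 0 for boundary cells, is an assumption made for the example.

```python
def region_signatures(labels, bands):
    """Compute the mean gray tone per band and the cell count per region.

    labels: 2-D list of region labels, 0 meaning boundary (skipped).
    bands: list of 2-D lists of gray tones with the same shape as labels.
    Returns {label: (mean_vector, area)}.
    """
    sums, areas = {}, {}
    for r, row in enumerate(labels):
        for c, lab in enumerate(row):
            if lab == 0:
                continue
            if lab not in sums:
                sums[lab] = [0.0] * len(bands)
                areas[lab] = 0
            for i, band in enumerate(bands):
                sums[lab][i] += band[r][c]
            areas[lab] += 1
    return {lab: ([s / areas[lab] for s in sums[lab]], areas[lab]) for lab in sums}


# Hypothetical two-band image with two labelled regions.
labels = [[1, 1, 0], [2, 2, 2]]
band5 = [[10, 20, 99], [30, 30, 60]]
band7 = [[1, 3, 99], [5, 5, 5]]
signatures = region_signatures(labels, [band5, band7])
```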

The measurement space clustering is a pure clustering procedure.
While it is not limited to image clustering, it has been implemented to
cluster spectral signatures. The procedure is an iterative one. The
distance used for measuring "closeness" between clusters is the Euclidean
distance defined on the measurement space dimensions. Within an iteration,
the procedure finds the closest neighbor for each signature. These pairs
of closest neighbors are linked together to form groups. For this, the
user enters two parameters to ensure that closest neighbors which may be
too far apart are not clustered together. After clustering, a new table
is generated, which consists of the signatures of the new clusters obtained
in the last iteration.
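One iteration of such a closest-neighbor linking scheme might be sketched as below. The report's two user parameters are not described, so a single hypothetical `max_distance` stands in for them; forming groups as connected chains of linked pairs and averaging signatures by area weight are likewise assumptions consistent with, but not confirmed by, the text.

```python
import math


def cluster_iteration(signatures, max_distance):
    """Link each signature to its closest neighbor and merge the groups.

    signatures: list of (mean_vector, area) tuples.
    A link is made only when the Euclidean distance is <= max_distance.
    Returns a new list of (area-weighted mean_vector, total_area).
    """
    n = len(signatures)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        best, best_d = None, math.inf
        for j in range(n):
            if j != i:
                d = math.dist(signatures[i][0], signatures[j][0])
                if d < best_d:
                    best, best_d = j, d
        if best is not None and best_d <= max_distance:
            parent[find(i)] = find(best)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    merged = []
    for members in groups.values():
        area = sum(signatures[m][1] for m in members)
        dims = len(signatures[members[0]][0])
        vector = [
            sum(signatures[m][0][k] * signatures[m][1] for m in members) / area
            for k in range(dims)
        ]
        merged.append((vector, area))
    return merged


# Hypothetical two-band signatures: two close regions and two outliers.
table = [([10.0, 10.0], 2), ([11.0, 10.0], 2), ([50.0, 50.0], 4), ([100.0, 0.0], 1)]
new_table = cluster_iteration(table, max_distance=5.0)
```

Calling the function repeatedly on its own output reproduces the iterative reduction in the number of clusters; outlying signatures stay as isolated clusters because their closest neighbor is farther than the limit.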

The iterations are repeated until the number of clusters is reduced
to what is felt adequate. For example, for the Rice County image, 6
clustering iterations were carried out. In these, 409 homogenous regions
were reduced to 37 clusters.

Usually the number of clusters is left at a number larger than the
number of categories expected. This is done to prevent overclustering. In
almost all images, there are a few isolated regions with outlying
signatures. The way the process is set up, these come out as isolated
clusters. They don't normally define a well-known land use class and can
be ignored. The discarding of clusters is made after a visual examination
or a check on the relative size of the clusters. If one did not allow for
these isolated cases to occur in the final number of clusters, it might
cause different classes to merge. Forcing a smaller number of clusters
may put together different classes whose signatures are closer than
those for the outlying bunch.

The final step in the classification process is another spatial
operation. In this, we grow the clusters into the area which had been
marked as boundary. The operation assigns each boundary cell to the
cluster it is closest to. This way we include all the edge cells which
were stripped off during the initial phase of the spatial clustering.
This last process is analogous to the generalization a cartographer makes
when creating a land use classification map.
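A possible reading of this growing step is a breadth-first fill from all labelled cells at once, so that each boundary cell takes the label of the spatially nearest cluster cell. Measuring "closest" in 4-connected steps, and breaking ties by scan order, are assumptions of this sketch.

```python
from collections import deque


def grow_into_boundary(labels):
    """Assign every 0-cell the label of its nearest labelled cell.

    labels: 2-D list, 0 meaning boundary. Distance is counted in
    4-connected steps; ties go to whichever label reaches the cell first.
    The input is not modified.
    """
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    queue = deque(
        (r, c) for r in range(rows) for c in range(cols) if labels[r][c] != 0
    )
    while queue:
        r, c = queue.popleft()
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= rr < rows and 0 <= cc < cols and out[rr][cc] == 0:
                out[rr][cc] = out[r][c]
                queue.append((rr, cc))
    return out


# Hypothetical clustered image with a band of boundary cells in the middle.
clustered = [[1, 0, 0, 2], [0, 0, 0, 0]]
generalized = grow_into_boundary(clustered)
```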

One other fact remains to be mentioned before discussing each county
image. The spatial sample is the area or the number of resolution cells
that the image covers. By increasing the sample size by a factor of four,
doubling the image vertically and horizontally, the results of the spatial
clustering are considerably enhanced. This is because each individual
field has more cells, which gives it a better definition. The increase in
sample size is done by expanding the image vertically and horizontally by
a factor of two. While the results are better, the price paid for this is
in terms of processing time and memory space. However, for images of this
small size, this increase was not significant.
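The report does not say how the expansion was carried out. The simplest form, assumed here, is cell replication: every cell is repeated twice along each row and every row is repeated twice, giving four times as many cells.

```python
def expand_by_two(image):
    """Double a 2-D image vertically and horizontally by cell replication."""
    out = []
    for row in image:
        doubled = [value for value in row for _ in range(2)]
        out.append(doubled)
        out.append(doubled[:])  # independent copy for the repeated row
    return out


# Hypothetical 2 x 2 image expanded to 4 x 4.
expanded = expand_by_two([[1, 2], [3, 4]])
```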

For the figures included in this report, the images were compressed
down and scaled. This was done to accommodate the size of an ERTS cell as
well as the uneven printer cell ratio. The compression for display
purposes is done by omitting selected lines or columns. The selection is
determined by the compression ratio. All this may result in the omission
or drastic reduction of some small regions on the printout, even though
they were included in the processing. The figures are included here for
qualitative purposes generally. The character and shape of all the major
fields is maintained. Any quantitative assessment is, of course, done at
the expanded scale.

2.1 Rice County Image

In the last report, the processing of the Rice County image was done
on the principal component image. This last quarter, the processing was
repeated on the original ERTS images. The Rice County site consisted of
four images taken on October 21, 1973, April 18, 1974, June 12, 1974, and
July 18, 1974. Thus, there were sixteen bands of information. All these
bands were expanded vertically and horizontally to increase the sample
size. For the spatial clustering, MSS bands 5 & 7 of the April and June
dates were selected. Contrast enhancing was performed on each of these
bands. The spatial clustering procedure was then carried out. It resulted
in 386 homogenous regions which are depicted in Figure 2.1.1.

As mentioned above, some of the smaller regions may not be seen
because of the compression. Also, though each region has a unique label,
they are shown here with repeating labels. This is because the printer
had only a limited number of symbols available. Without resorting to
overprinting or color, it is not possible to display each region uniquely.
The figure, however, does illustrate the different regions of the image.
The blank areas between the fields constitute the boundary section which
separates the regions.

It was felt that the spatial clustering was not strong enough, and
some fields came out connected when they should not have been. A
splitting operation was then performed. In this, the different parts of
the homogenous regions of Figure 2.1.1 were examined to see if their means
differed enough to separate them. This checking was done using the four
MSS bands 5 & 7 of the April and June dates. Under the threshold given,
the 286 regions were split into 409 regions. These can be seen in Figure
2.1.2.

The 409 regions were clustered on the basis of their spectral
signatures defined by all 16 bands. The series of reduction of the number
of clusters was 409-202-139-106-64-47-37, in 6 iterations. In order to
see some pattern in the clustering, the results of the last four
iterations are shown in Figures 2.1.3 to 2.1.6. These figures correspond
to 106, 64, 47, and 37 clusters respectively.

The result of spatially generalizing Figure 2.1.6 is shown in Figure
2.1.7. Here the boundary or unclassified cells are assigned category
labels using the nearest neighbor rule.

No quantitative analysis could be performed on the classification
results for Rice County, as was done for the others. It shall be done
this coming quarter. For completeness' sake, we have included some soil
and ground truth maps for this test site. Figure 2.1.8 shows the crop
ground truth map. This will be used for quantitative verification of the
clustering. The four crop classes shown on this map are wheat, grain
sorghum, corn, and summer fallow. These are denoted by letters A through
D respectively.

Figure 2.1.1 Rice County - Homogenous regions
Figure 2.1.2 Rice County - "Split" homogenous regions
Figure 2.1.3 Rice County - Third clustering iteration
Figure 2.1.4 Rice County - Fourth clustering iteration
Figure 2.1.5 Rice County - Fifth clustering iteration
Figure 2.1.6 Rice County - Final clustering
Figure 2.1.7 Rice County - Spatially generalized clustered image
Figure 2.1.8 Rice County - Crop ground truth
Figure 2.1.9 Rice County - "Shrunken" crop ground truth
Figure 2.1.10 Rice County - Soil map

It should be noted that out of the 37 categories in Figure 2.1.6,
95% of the image consists of the six classes A, C, G, J, K, and X. The
rest of the classes can either be merged into these six or discarded, as
they are too small.

In Figure 2.1.9, the shrunken ground truth of the area is provided.
It is the result of spatially shrinking the ground truth map of Figure
2.1.8. This gets rid of the cells on the edges of the fields. It is used
when the signatures of some fields are not very reliable.
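The shrinking rule is not spelled out in the report. One assumed form is a single erosion pass that clears any ground truth cell touching a cell with a different label, as sketched here with a hypothetical `background` value for cleared cells.

```python
def shrink_ground_truth(truth, background=0):
    """Clear every cell that has a 4-neighbour with a different label.

    truth: 2-D list of ground truth labels. Cells on the image border
    are kept unless they touch a different label. Input is not modified.
    """
    rows, cols = len(truth), len(truth[0])
    out = [row[:] for row in truth]
    for r in range(rows):
        for c in range(cols):
            label = truth[r][c]
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= rr < rows and 0 <= cc < cols and truth[rr][cc] != label:
                    out[r][c] = background
                    break
    return out


# Hypothetical strip with a wheat field (A) next to a grass field (B).
shrunken = shrink_ground_truth([["A", "A", "A", "B", "B", "B"]])
```

Only the two cells on either side of the field edge are cleared, leaving the field cores from which more reliable signatures can be taken.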

Figure 2.1.10 shows the soil map for this test site. 

2.2 Morton County Image

The Morton County image consisted of twenty bands of information.
There are five ERTS images taken on October 23, 1973, May 9, 1974, May 21,
1974, June 14, 1974, and July 2, 1974. Again, for the spatial clustering,
only a few bands were selected to minimize the misregistration error. Each
of these bands was spatially expanded by two and run through a contrast
enhancing operation. This was followed by the Roberts Distance 1 gradient
option. After thresholding, cleaning, and labelling, the resulting image
contained 607 regions. Figure 2.2.1 shows us a scaled version of this
image. The fields and boundaries were brought out well by this series of
operations. Thus, it was not felt necessary to perform a splitting
function.

The measurement space clustering was performed using the 10 MSS bands
5 & 7 of the five dates. All twenty bands could have been used, for
spatial misregistration is not as critical for this as it is for the
gradient operation. Only 10 bands were chosen, however, to keep processing
time small. Figure 2.2.2 shows the clustered Morton County image. It has
23 classes, which were the result of clustering the original 607 regions
of Figure 2.2.1 in 10 iterations. In Figure 2.2.3 we see the final result.
It is the spatially generalized clustered image.

While 23 classes seems too many for a small image like this, it should
be noted that most of these can be discarded. They constitute a very small
percentage of the image. As mentioned before, these small classes have
outlying spectral signatures and do not usually correspond to any useful
land use class. While it would be interesting to find out exactly what
they correspond to, it is difficult to do so, because of their small size.

Figure 2.2.1 Morton County - Homogenous regions
Figure 2.2.2 Morton County - Clustered image - 23 clusters
Figure 2.2.5 Morton County - Crop ground truth
Figure 2.2.6 Morton County - "Shrunken" crop ground truth
Figure 2.2.7 Morton County - Wheat (a) and non-wheat (b) ground truth

In Figure 2.2.3, 91% of the image consists of classes A, B, C, and D.
By adding the next three largest clusters, E, J, and K, 96% of the image
is included. If we go up to the 10 largest classes, it leaves 1.2% of
the image divided up between the 13 small classes. These 13 classes may
be ignored.
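The coverage figures quoted here are cumulative area percentages over the classes ranked by size. A minimal sketch of that bookkeeping follows; the class areas used are made up for illustration and are not the Morton County counts.

```python
def cumulative_coverage(class_areas):
    """Rank classes by area and report cumulative percent of the image.

    class_areas: dict mapping a class label to its cell count.
    Returns a list of (label, cumulative_percent), largest class first.
    """
    total = sum(class_areas.values())
    ranked = sorted(class_areas.items(), key=lambda item: item[1], reverse=True)
    running, coverage = 0, []
    for label, area in ranked:
        running += area
        coverage.append((label, 100.0 * running / total))
    return coverage


# Hypothetical cell counts for four classes.
coverage = cumulative_coverage({"A": 50, "B": 30, "C": 15, "D": 5})
```

Reading down the list shows where the remaining classes stop contributing a meaningful share of the image and can be considered for merging or discarding.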

The quantitative analysis and an attempt at semi-automatic
interpretation of these classes is discussed in a later section. It is
the hardest and the most time-consuming part of this process. Included
here are some figures giving the ground truth information which was used
for interpretation. Figure 2.2.4 shows the soil map of the area. Figure
2.2.5 is the crop ground truth with 6 categories. These are wheat, grass,
corn, summer fallow, grain sorghum, and rye. They are denoted by the
letters A through F, respectively.

Figure 2.2.6 is the shrunken ground truth map for Morton County.
Figure 2.2.7 gives us a wheat versus non-wheat map. Since the
determination of the wheat yield is important, the wheat class was
isolated to be checked as a special case.

2.3 Finney County Clustering

Most of the processing on Finney County was similar to that done on
Morton County. This image also consisted of 20 bands of information, and
only some were selected for the spatial processing. The dates for the
five images were October 23, 1973, April 20, 1974, May 8, 1974, May 26,
1974, and July 1, 1974. Two sets of MSS bands 5 & 7 of pre- and
post-harvest dates were chosen. The operations of contrast enhancing,
gradient, thresholding, cleaning, and labelling were performed on these
four bands. Again, all processing was done on a spatially expanded image.
The spatial clustering process resulted in 1148 homogenous regions, which
are depicted in Figure 2.3.1. This is a compressed and scaled version of
the original file. As for Morton County, it was felt that a splitting
operation was not necessary. The fields seemed to be separated nicely
from each other.

Measurement space clustering was then carried out using these regions
and the 10 MSS bands 5 & 7 of the five original dates. It took 10
iterations to reduce the 1148 regions down to 29 clusters, which are shown
in Figure 2.3.2. In the next figure, we have the result from spatial
generalization of this image.

Figure 2.3.1 Finney County - Homogenous regions
Figure 2.3.2 Finney County - Clustered image
Figure 2.3.5 Finney County - Crop ground truth
Figure 2.3.6 Finney County - "Shrunken" crop ground truth
Figure 2.3.7 Finney County - Wheat (a) and non-wheat (b) ground truth

Out of the 29 classes, the bulk of the image falls under 6 clusters.
These occupy 98.2% of the image.

The analysis and interpretation of the Finney County processing
is done later on. Included here are the corresponding ground truth maps
for this area. Figure 2.3.4 is the soil map of the test site. In Figure
2.3.5 we have the crop information. There are five classes, shown by
labels A through E, corresponding to wheat, grass, corn, summer fallow, and
grain sorghum. Figure 2.3.6 is the shrunken crop ground truth, while
Figure 2.3.7 shows the wheat versus non-wheat layout for this area.

2.4 Saline County Image

It was mentioned in the last report that the ground truth of the
images seemed incorrect in some places. The Saline County test site image
gave us an opportunity to check both the clustering process and the ground
truth. The geography applications laboratory at KU had some aerial photo-
graphy over part of Saline County taken in April of 1974. Mr. Jim Merchant
of the above lab consented to do some manual interpretation of the unsuper-
vised clustering product, using this photography.

Unfortunately, the aerial run did not cover the LACIE test site, but
was to the north of it. It did, however, overlap with the top half of the
Saline County image provided on tape by NASA. The LACIE test site occupies
the bottom right corner of this image.

The image on the tape was 419 rows by 290 columns. The bottom right
331 rows and 150 columns of this were extracted for processing. This gave
a vertical strip image with the aerial photograph covering the top half
of it.

The analysis of the image was carried out in two ways. First, a com-
puter printout of MSS band 7 of the April date was generated. This is
shown in Figure 2.4.1. Using the aerial photograph and county maps, fields
and regions were marked out on the printout for ground truth classification.
These areas were used both for the Bayes classification rules and for inter-
preting the unsupervised clustering result.

First time through, the unsupervised clustering was carried out on all
twelve bands. However, this result was difficult to match to the printout
with the ground truths marked on it. To get some idea of the accuracy of
clustering, the unsupervised clustering was redone, using the 4 bands of
the April date. These were first expanded, which gave an image 662 rows
by 300 columns. The spatial clustering was carried out on MSS bands 5 &
7 only. The resulting image had 2359 regions, and a compressed version of
this is shown in Figure 2.4.2. The clustered image is shown in the next
figure. It was obtained using all four bands of the April date. The 2359
regions were reduced to 23 classes in 8 iterations. The image was spatially
generalized and this result is shown in Figure 2.4.4.

Figure 2.4.3 Saline County - Clustered image
Figure 2.4.4 Saline County - Spatial generalisation of clustered image
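
The window extraction and the expansion quoted in this section (419 by 290
cells on tape, a 331 by 150 strip, then 662 by 300 after expansion) can be
checked with a short Python sketch. The report does not say how the image
was expanded; cell replication by a factor of two is an assumption here,
chosen only because it reproduces the quoted dimensions. The function names
and the dummy gray tone values are likewise invented for illustration.

```python
def bottom_right(image, n_rows, n_cols):
    """Return the bottom-right n_rows x n_cols window of a 2-D list."""
    return [row[-n_cols:] for row in image[-n_rows:]]

def expand(image, factor=2):
    """Enlarge an image by repeating every cell `factor` times each way.

    Replication is an assumed expansion method, not one stated in the text.
    """
    out = []
    for row in image:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

# Dummy image with the tape dimensions quoted in the text.
tape = [[(r + c) % 256 for c in range(290)] for r in range(419)]

strip = bottom_right(tape, 331, 150)   # vertical strip used for processing
big = expand(strip)                    # spatially expanded strip

print(len(strip), len(strip[0]))
print(len(big), len(big[0]))
```

Each original cell becomes a 2 by 2 block, so region boundaries fall between
original cells rather than on top of them.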

An interpretation of the cluster labels was carried out using the
aerial photograph and the ground truth computer printout. The result of
this is given in Table 2.4.1.


LABEL        FREQUENCY   CATEGORY

A            18.5%       Wheat
B             1.7%       Wheat
C             3.5%       Rangeland
D            18.6%       Rangeland
E             3.1%       Woodland
F             7.0%       Bare ground
G            17.5%       Bare ground
H                        Rangeland-grass-pasture
I             0.3%       Rangeland
J             2.9%       Bare ground
K            17.1%       Wheat
N             2.7%       Wheat
P             1.0%       Woodland
L,M,O,Q-W     1.6%       Not classifiable

Table 2.4.1

A few points emerge upon examining the table. The clustering came up
with no fewer than 4 labels for wheat, along with multiple labels for
rangeland and bare ground. At first glance, this would suggest that the
clustering may have been stopped prematurely, and that another iteration
or two would have merged them together. However, this may not be true. The
different shapes of wheat or different types of rangeland may constitute
the same nominal class, but have different spectral signatures on the
satellite image. Thus, the computer treats them as separate spectral
categories. Imposing further clustering may cause the merging of some
spectral categories across nominal classes. For example, if one sub-
class of wheat has a signature closer to a rangeland sub-class than to
the other wheat sub-classes, an additional iteration may put it
together with rangeland.

As a matter of fact, this may already have happened in two instances.
As may be seen from the table, there was no water class detected. How-
ever, on the image printout, there are occurrences of water. One example
is the meandering river flowing in the center left of the picture. It
was not expected that the spatial process would pick out this river, as it
is too narrow for the gradient and cleaning operators to resolve.
That holds true also for some scattered ponds, which consist of a few cells
each. However, there is one large lake about a hundred cells in size, which
lies one-third of the way down from the top and about [illegible] cells from
the right border (see Figure 2.4.4). It was found to be classified with the
label "G", which corresponds mostly to bare ground. This is unfortunate, as
water is an important class and should have come out separately. The second
instance of misclassification is an occurrence of label "E". The
aerial photograph tells us that the area lying just below and to the right
of the lake is bare ground. The clustering process, however, gave it the
label for the woodland category.

Other than these cases, the clustering seemed to be quite good. It
needs to be determined, however, why these misclassifications occurred. If
these misassignments occurred in the last iteration, then perhaps the clus-
tering went too far, or the user entered parameters that were too liberal.
This calls for an examination of the clustering process iteration by
iteration for these particular areas. Unfortunately, at this time the
intermediate signatures of the clusters are not available. They would let
us see whether the clustering was overdone or whether the original signatures
of the areas were misleading to begin with. Currently, the process is
being modified to print these intermediate results for such verification.

Once the labels are associated with classes, some relabelling may be
carried out. This would allow all the sub-classes of the same nominal
category to be assigned the same label. That then would be the final clus-
tering result.
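
The relabelling step can be sketched in a few lines of Python. The
frequencies and categories below are copied from Table 2.4.1; label H is
left out because its frequency is not legible in the table, and the
function names relabel and category_totals are invented for illustration.
This is a sketch of the idea, not the procedure that will actually be run.

```python
# Label -> (frequency in percent, interpreted category), from Table 2.4.1.
# Label H is omitted: its frequency is not legible in the table.
TABLE = {
    "A": (18.5, "Wheat"), "B": (1.7, "Wheat"),
    "C": (3.5, "Rangeland"), "D": (18.6, "Rangeland"),
    "E": (3.1, "Woodland"), "F": (7.0, "Bare ground"),
    "G": (17.5, "Bare ground"), "I": (0.3, "Rangeland"),
    "J": (2.9, "Bare ground"), "K": (17.1, "Wheat"),
    "N": (2.7, "Wheat"), "P": (1.0, "Woodland"),
}

def relabel(cluster_image, table=TABLE):
    """Replace each cluster label by its nominal category.

    Labels without an interpretation are mapped to None.
    """
    return [[table[label][1] if label in table else None for label in row]
            for row in cluster_image]

def category_totals(table=TABLE):
    """Sum the label frequencies of every nominal category."""
    totals = {}
    for frequency, category in table.values():
        totals[category] = totals.get(category, 0.0) + frequency
    return {category: round(total, 1) for category, total in totals.items()}

# A tiny invented cluster image; "Q" is one of the unclassifiable labels.
merged = relabel([["A", "K", "D"], ["G", "Q", "B"]])
totals = category_totals()
```

With these figures the four wheat labels together account for 40.0% of the
image, which is the kind of summary the relabelled result would provide.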

In the table above, the current labels cover about 98.4% of the image.
Out of the remaining 1.6%, the label "O" constitutes 1% of the image. It
seems to be the only unclassified area large enough to be important. How-
ever, as it occurs in quantity only in the lower half of the image, it was
difficult to find what category it represented. The supporting photograph
of the ground truth only covered the top half of the strip.

The relabelling will be done once the misclassifications in the image
have been rectified.

The human interpretation yielded only one crop class, wheat. There
are a few reasons for this. Wheat was predominant over the entire strip.
The other crops occurred only in small areas. Also, the interpretation
was made using two bands of the April date, for which only wheat stood out.

For the April date alone, the classification seemed quite good. It
is hoped that similar analysis can be done using other dates, if addi-
tional aerial photography can be obtained.

To check the ground truth after relabelling, we can compare the clus-
tered result with the ground truth of the test site. The fields which are
of the same class should have the same label in both images. Discrepancies
can be singled out and their signatures checked. The corresponding areas
in the ground truth can then be corrected.
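
The comparison described here amounts to a cell-by-cell check of two
label images. A minimal Python sketch follows, assuming both images have
already been brought to the same size and the same category names; the
function name find_discrepancies and the small example images are
invented for illustration.

```python
def find_discrepancies(clustered, ground_truth):
    """List (row, col, clustered_class, ground_truth_class) where they differ.

    Cells that are unlabelled (None) in either image are skipped.
    """
    mismatches = []
    for r, (c_row, g_row) in enumerate(zip(clustered, ground_truth)):
        for c, (c_cls, g_cls) in enumerate(zip(c_row, g_row)):
            if c_cls is not None and g_cls is not None and c_cls != g_cls:
                mismatches.append((r, c, c_cls, g_cls))
    return mismatches

# Invented 2 x 2 example: one cell disagrees, one cell is unlabelled.
clustered = [["Wheat", "Wheat"], ["Rangeland", None]]
ground_truth = [["Wheat", "Corn"], ["Rangeland", "Wheat"]]
mismatches = find_discrepancies(clustered, ground_truth)
```

Each entry in the returned list points to a cell whose signature would
then be inspected before deciding whether the ground truth is at fault.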

2.5 Conclusion

Although we have finished the clustering, the comparison between the
clusters and the ground truth has not been completed. During the next
quarter we will fix any incorrectly given ground truth and report on the
detailed comparison. It is hoped also that some of the county images will
be reclustered, using an "Isodata" option in the measurement space clus-
tering. This is at present being implemented. It should reduce the oc-
currences of misassignments that came up for the Saline County image.




3.0 Spectral-Temporal Classification Using Vegetation Phenology


The usual model for classification implicitly assumes that the pheno-
logical growth stage for each vegetation category is the same for all observa-
tions made at a single time. It is well known, however, that even in a geo-
morphologically homogeneous area, the phenological growth stages for each
vegetation type are not the same. This slop in phenological growth stage is
then reflected in probability distributions with larger variance than they
should have. The larger variance causes a lower classification accuracy for an
optimal decision rule. The solution to this problem is to focus on what
phenological growth stages are appropriate for any spectral reflectance for
each category.

One classification algorithm which makes use of vegetation phenology
has a direct and simple description. For example, if a 2-band spectral
observation $(\alpha_1, \alpha_2)$ is made using wavelengths
$(\lambda_1, \lambda_2)$ at time $t$, classification can be done by
determining for each category c all those phenological growth stages of
vegetation of category c which can yield spectral return $\alpha_1$ at
wavelength $\lambda_1$ and spectral return $\alpha_2$ at wavelength
$\lambda_2$. If there is not a phenological growth stage of category c
which yields spectral returns $\alpha_1$ and $\alpha_2$ at wavelengths
$\lambda_1$ and $\lambda_2$, then category c is not a possible choice.
Classification may then be done by eliminating inappropriate category choices.
A spectral observation taken at a later calendar time can be naturally
constrained to be associated with later phenological growth stages in order
to keep an earlier accepted possibility of category c open at the later
observation time.
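
The elimination rule just described can be sketched in Python. Everything
concrete in the sketch is an assumption made for illustration: the table
STAGE_TABLE and its numbers are invented, each growth stage is represented
by one admissible range per band, and "later" is read as "not earlier", so
a category may remain in the same growth stage between two observation
dates.

```python
# Hypothetical look-up table. For each category, a list indexed by growth
# stage; each entry holds one (low, high) admissible range per band.
# All numbers are invented for illustration.
STAGE_TABLE = {
    "wheat": [((10, 20), (30, 40)),   # stage 0
              ((15, 25), (45, 60)),   # stage 1
              ((30, 40), (20, 30))],  # stage 2
    "corn":  [((25, 35), (10, 20)),   # stage 0
              ((12, 18), (32, 38)),   # stage 1
              ((20, 30), (50, 65))],  # stage 2
}

def compatible(observation, ranges):
    """True if every band value lies inside the stage's admissible range."""
    return all(low <= value <= high
               for value, (low, high) in zip(observation, ranges))

def possible_categories(observations, table=STAGE_TABLE):
    """Categories still possible after a time-ordered list of observations.

    A category survives only if every observation can be matched to a
    growth stage and the matched stages never move backwards in time.
    """
    survivors = []
    for category, stages in table.items():
        earliest = 0  # earliest growth stage still allowed at this date
        for obs in observations:
            feasible = [g for g in range(earliest, len(stages))
                        if compatible(obs, stages[g])]
            if not feasible:
                break  # no admissible stage: eliminate this category
            earliest = feasible[0]
        else:
            survivors.append(category)
    return survivors

one_date = possible_categories([(16, 35)])
three_dates = possible_categories([(16, 35), (22, 55), (35, 25)])
```

In this invented example a single observation leaves both categories
open, while the three-date sequence is consistent only with the wheat
entries of the table. Keeping the earliest feasible stage at each date is
enough to decide whether some non-decreasing sequence of stages exists.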



3.1 Bayesian Perspective 


The usual model for the classification of remotely sensed data has been
one relying on simple statistical structure. For example, to discriminate
corn from wheat, it is assumed that data vectors coming from the corn category
are distributed according to one probability distribution and data vectors
coming from wheat are distributed according to another probability distribution.
A Bayes or maximum likelihood criterion then determines an optimal decision rule.
Unfortunately, the world is not as simple as the model assumes. Besides
atmospheric haze and geomorphologic soil and moisture variations, which un-
doubtedly affect spectral returns, it is not the case that data vectors coming
from corn are distributed according to a simple probability distribution.
Rather, for each phenological growth stage for corn, spectral data vectors
coming from corn are distributed according to a probability distribution. The
statistical model, therefore, requires the probability function of spectral
reflectance vector x coming from the category c in phenological growth stage g
at calendar time t. We denote this probability by
$P_t(x, c, g)$.

For multi-temporal multi-spectral data, the probability function of
spectral reflectance vectors $x_1, \ldots, x_N$ coming from categories
$c_1, \ldots, c_N$ in phenological growth stages $g_1, \ldots, g_N$ at
calendar times $t_1, \ldots, t_N$, respectively, is denoted by


To determine a Bayes rule, the probability
must be determined. Now,

The causal mechanism which produces a reflectance x given a category c at
phenological growth stage g guarantees that

since a vegetative category and its phenological growth stage at time $t_n$ are
the only determinants of the spectral reflectance $x_n$. Likewise, the vegetation
growth mechanism guarantees that

Hence,

In theory, the formula just derived could be used to determine a Bayes
rule in the usual way. In practice, there are too many distributions to
estimate and too many calculations to do to obtain the required probabilities.
However, because the required probability has the form of a product, if
any probability in the product is zero, the product must be zero, and a
Bayes rule would never make an assignment to a category with a zero probability.
This fact can be capitalized on to make an efficient table look-up rule.



3.2 Summary of Work to be Done for Final Quarter

Using the corrected ground truth for each test site and the spectral
time plots of each crop category by field, we will construct spectral-
phenological growth state curves for each category. Then, using the theory
developed in 3.1, we will implement a table look-up rule which classifies
using the derived growth state curve. The results of this table look-up
scheme will be compared with the classical approach. We are expecting
the classification error to decrease by a factor of three with this
procedure.




